Skip to content

[PLT-1008] Export content parses incorrectly #1628

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 27, 2024

Conversation

adrian-chang
Copy link
Contributor

@adrian-chang adrian-chang commented May 24, 2024

  • Introduce Buffered Stream
  • Prep deprecation of non-buffered methods
  • Fixup errors / result in export.task to use bufferedstream

vbrodsky
vbrodsky previously approved these changes May 24, 2024
Copy link
Contributor

@vbrodsky vbrodsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the approach and agree with it. The code looks right, not enough deep understanding on my end to dig into each and every line.

@adrian-chang adrian-chang changed the title [PLT-0] fix bug meta's [PLT-1008] Export content parses incorrectly May 24, 2024
@adrian-chang adrian-chang force-pushed the achang/plt-0-fix-bug-meta branch from 28d672c to 8bfe14c Compare May 25, 2024 07:11
@adrian-chang adrian-chang force-pushed the achang/plt-0-fix-bug-meta branch from d75ce34 to 705275b Compare May 26, 2024 05:34
@Gabefire
Copy link
Collaborator

Are you technically deprecating the FileConverterType with this change since the buffered stream does not look like you can pass custom converters?

@Gabefire
Copy link
Collaborator

NIT: one last thing: would it make sense to change the default value for the convertor in the get_stream method to the buffered version so the deprecation messages do not fire twice/ clean up the method easily? This is for this workflow.

@adrian-chang
Copy link
Contributor Author

adrian-chang commented May 27, 2024

Are you technically deprecating the FileConverterType with this change since the buffered stream does not look like you can pass custom converters?

Sort of. I think the way it is written is a bit incorrect (yielding a file constantly is very odd in my view) but the main idea of yielding an entire file is correct.

The Export_Task code is very over engineered and buggy and needs to be simplified.

We should aim to give customers basically 3 outputs: fully outputted in memory, line by line in memory, or an entire file by disk and that's it.

Allowing for a lot of customization such as choosing what chunk to export at (offset) or line makes etc. just makes our life more difficult maintaining the SDK.

@adrian-chang
Copy link
Contributor Author

adrian-chang commented May 27, 2024

NIT: one last thing: would it make sense to change the default value for the convertor in the get_stream method to the buffered version so the deprecation messages do not fire twice/ clean up the method easily? This is for this workflow.

No. BufferedJsonOutput has a schema of { json: any } versus { json_str: str }. Changing the convertor is a potential breaking change.

This is also why buffered_stream was introduced.

@adrian-chang adrian-chang merged commit 80c730c into develop May 27, 2024
22 checks passed
@adrian-chang adrian-chang deleted the achang/plt-0-fix-bug-meta branch May 27, 2024 01:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants